An Information Theoretic Framework for Multi-view Learning

نویسندگان

  • Karthik Sridharan
  • Sham M. Kakade
چکیده

In the multi-view learning paradigm, the input variable is partitioned into two different views X1 and X2 and there is a target variable Y of interest. The underlying assumption is that either view alone is sufficient to predict the target Y accurately. This provides a natural semi-supervised learning setting in which unlabeled data can be used to eliminate hypothesis from either view, whose predictions tend to disagree with predictions based on the other view. This work explicitly formalizes an information theoretic, multi-view assumption and studies the multi-view paradigm in the PAC style semisupervised framework of Balcan and Blum [2006]. Underlying the PAC style framework is that an incompatibility function is assumed to be known — roughly speaking, this incompatibility function is a means to score how good a function is based on the unlabeled data alone. Here, we show how to derive incompatibility functions for certain loss functions of interest, so that minimizing this incompatibility over unlabeled data helps reduce expected loss on future test cases. In particular, we show how the class of empirically successful coregularization algorithms fall into our framework and provide performance bounds (using the results in Rosenberg and Bartlett [2007], Farquhar et al. [2005]). We also provide a normative justification for canonical correlation analysis (CCA) as a dimensionality reduction technique. In particular, we show (for strictly convex loss functions of the form `(w·x, y)) that we can first use CCA as dimensionality reduction technique and (if the multi-view assumption is satisfied) this projection does not throw away much predictive information about the target Y — the benefit being that subsequent learning with a labeled set need only work in this lower dimensional space.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs

Multi agent Markov decision processes (MMDPs), as the generalization of Markov decision processes to the multi agent case, have long been used for modeling multi agent system and are used as a suitable framework for Multi agent Reinforcement Learning. In this paper, a generalized learning automata based algorithm for finding optimal policies in MMDP is proposed. In the proposed algorithm, MMDP ...

متن کامل

Information-Theoretic Multi-view Domain Adaptation: A Theoretical and Empirical Study

Multi-view learning aims to improve classification performance by leveraging the consistency among different views of data. The incorporation of multiple views was paid little attention in the studies of domain adaptation, where the view consistency based on source data is largely violated in the target domain due to the distribution gap between different domain data. In this paper, we leverage...

متن کامل

The Effect of Grammar vs. Vocabulary Pre-teaching on EFL Learners’ Reading Comprehension: A Schema-Theoretic View of Reading

This study was designed to investigate the effect of grammar and vocabulary pre-teaching, as two types of pre-reading activities, on the Iranian EFL learners’ reading comprehension from a schema–theoretic perspective. The sample consisted of 90 female students studying at pre-university centers of Isfahan.  The subjects were randomly divided into three equal-in-number groups. They participated ...

متن کامل

A general formulation for learning multi-class posterior probabilities

In this paper, we use partial likelihood (PL) theory to introduce a general formulation for learning multi-class posterior probabilities. The formulation establishes a fundamental information-theoretic connection, the equivalence of partial likelihood maximization and relative entropy minimization, without making the common assumption of independent data samples. We further show that this funda...

متن کامل

Multi-information Ensemble Diversity

Understanding ensemble diversity is one of the most important fundamental issues in ensemble learning. Inspired by a recent work trying to explain ensemble diversity from the information theoretic perspective, in this paper we study the ensemble diversity from the view of multi-information. We show that from this view, the ensemble diversity can be decomposed over the component classifiers cons...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008